Estimation of food volume and carbs

ABSTRACT

There is disclosed a system for estimating the volume of food on a plate, for example a meal, with a mobile device. The system uses a camera and a light pattern projector. Images of the food with and without a projected light pattern on it enable computation of the tridimensional shape and volume, while image segmentation and recognition steps estimate one or more food types in said images. By applying accessible knowledge databases, the carbs content is estimated and the associated insulin bolus doses are provided. Developments comprise coding of the light pattern, different light sources and associated wavelengths, motion compensations, additional optics, estimation of fat content and associated multi-wave boluses. The invention can be implemented in a glucometer, in an insulin pump controller provided with a test strip port, or in a mobile phone.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 14/978,433, filed Dec. 22, 2015, which is a continuation of PCT Application No. PCT/EP2014/063945, filed Jul. 1, 2014, which claims priority to EP Application Serial No. 13003338.4, filed Jul. 2, 2013, and which disclosures are incorporated herein fully by reference.

FIELD OF THE INVENTION

The present invention relates generally to methods, systems and devices for estimating the volume and carbohydrates (in the following referred to as carbs) content of food.

BACKGROUND OF THE INVENTION

Mobile phone associated applications represent a rapidly growing market that offers users a number of helpful tools for a wide variety of tasks. Recent achievements in hardware and signal processing have increased the usability of these tools and give users opportunities to carry out accurate measurements on their own.

In parallel, there is growing demand for portable devices suitable for self-assessment of diet, with an especially strong need for patients with Diabetes Mellitus of type I. The total number of persons with Diabetes is estimated at 400 million, and this number will substantially increase in the next few decades. One of the critical tasks for persons with diabetes is the control of the amount and type of food intake. For them, diet affects glycaemia much more than for healthy individuals. Clinical studies have shown that for children and adolescents on intensive insulin therapy an inaccuracy of ±10 g in carbs counting does not deteriorate the post-prandial control, while a ±20 g variation significantly impacts the postprandial glycaemia.

Food intake estimation is a non-trivial task due to a wide variety of food types and complex irregular shapes of servings. Image processing and computer vision techniques have made some progress, but numerous uncertainties remain and accumulate (recognition of the type of food, estimation of the volume, diverse lighting conditions, etc.). Regarding the estimation of volume, which is a key parameter, available 3D scanning techniques for industrial activities—using medium power consumption lasers to scan objects and reconstruct 3D shapes—are not adapted to mobile and personal environments. Industrial lasers require substantial sources of energy, may be dangerous for the eyes and may require several image acquisition devices.

The patent literature is developed for 3D or volume estimation in industrial environments, but scarce for mass-market type environments.

EP2161537 discloses a position measuring apparatus including a first irradiating part that irradiates a first beam to an object, a second irradiating part that irradiates a second beam to the object, a capturing part that captures images of the object, a processing part that generates a first difference image and a second difference image by processing the images captured by the capturing part, an extracting part that extracts a contour and a feature point of the object from the first difference image, a calculating part that calculates three-dimensional coordinates of a reflection point located on the object based on the second difference image, and a determining part that determines a position of the object by matching the contour, the feature point, and the three-dimensional coordinates with respect to predetermined modelled data of the object. This system presents drawbacks.

Shang et al: “Dietary intake assessment using integrated sensors and software”, Proceedings of SPIE, Vol. 8304, page 830403, describes a system consisting of a mobile device that integrates a smartphone and an integrated laser package; software on the smartphone for data collection and laser control; an algorithm to process acquired data for food volume estimation; and a database and interface for data storage and management. The laser package creates a structured light pattern, in particular a laser grid.

The system collects videos with slow movement of the camera around the food, stabilized at several positions, and collects video sequences. The laser is turned on and off during the video collection, resulting in video frames with and without laser grids alternately. As the motion between two adjacent frames is considered small, the laser grid lines can be extracted by subtracting the non-grid images from the grid images.

Since the smartphone has only moderate computational power, it is suitable for data collection but not for volume estimation. Therefore, acquired grid videos are transferred to a server for further processing. Furthermore, the food types are manually identified by the user, while the selection of several pairs of images whose motion is small is performed manually too. These manual steps limit the system's usability.

There is a need for methods, systems and portable devices for estimating the volume of objects (e.g. food) and for deriving food nutritional values and insulin bolus advice therefrom.

SUMMARY OF THE INVENTION

The present invention is related to applications for camera-enabled portable devices (e.g. mobile phones, smartphones) to be used for food volume estimation using structured light, for food segmentation/recognition by image analysis and for advanced bolus suggestion features. Using the combination of data analysis/processing and variations in surface highlighting, 3D scanning principles are adapted to mobile devices such as mobile phones, which present particular image acquisition conditions. Examples teach how to filter noise and signal artefacts. Embodiments of the invention solve technical problems including those of miniaturizing the light source to make it usable as an attachable device, optimizing processing algorithms to fit within the constraints of calculation power on mobile devices, and achieving effective noise reduction.

This is achieved by an inventive system as described in claim 1.

There is disclosed a system for estimating the volume of an object, said system comprising instructions which when performed by a processor result in operations comprising: receiving a first image of the object from an image acquisition component; projecting a light pattern on the surface of the object with a light pattern projection component; receiving a second image of the object being highlighted with said light pattern from the image acquisition component; subtracting the first and second images; identifying the projected light pattern on the object; and computing the tridimensional shape and the volume of the object given the deformations of the projected light pattern.
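
By way of illustration only, the sequence of claimed operations can be sketched in Python with OpenCV; the function name, file paths and the pattern threshold are hypothetical choices, not part of the disclosure:

    import cv2
    import numpy as np

    def pattern_from_image_pair(plain_path, patterned_path, threshold=40):
        # Receive the two images (without and with the projected pattern).
        plain = cv2.imread(plain_path).astype(np.int16)
        patterned = cv2.imread(patterned_path).astype(np.int16)
        # Subtract the first and second images; clipping removes negatives.
        diff = np.clip(patterned - plain, 0, 255).astype(np.uint8)
        # Identify the projected light pattern on the object.
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
        # The 3D shape and volume would then follow from the deformations
        # of the pattern (triangulation against the projector geometry).
        return mask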

In an optional development, the irradiation power of the light pattern projection component can be inferior to 5 mW. For ambient lighting conditions, a suitable irradiation power can be superior to 0.5 mW. In a development, the light pattern projection component comprises one or more light sources chosen from the list comprising a low intensity semiconductor LASER diode, a LED, an organic LED (OLED), a pre-existing mobile phone flash such as a white LED or a miniaturized halogen lamp, or a combination thereof. More preferably the light source is operated in continuous mode or in impulse mode with pulse frequencies between around 0.1 and 100 Hz, even more preferably between 0.5 and 10 Hz.

In a particular embodiment the power consumption of the light source is less than 0.7 W, in particular the power consumption is in a range between around 0.05 W and around 0.7 W.

In a development, the system, in particular the light pattern projection component, comprises an optical objective adapted to form and project and/or to focus the light pattern onto the object. Preferably the optical objective has an irradiation loss inferior to 60% of the total output of the irradiation power of the light source.

In a development, the relative pose and orientation of the light pattern projection component and the image acquisition component is static or invariant during the image acquisition operation and/or the light pattern projection operation. In a development, the projected light pattern is composed of geometrical motifs such as sequences of stripes, or dots, or repetitive graphical elements, or of a combination thereof. In particular the geometrical motifs have a bright area whose power density is inferior to 55 mW/cm².

In a development, the light pattern is coded by color and/or phase and/or amplitude modulation. In a development, the coding of the light pattern is predefined and is synchronized with the image acquisition component. The total irradiation power for all color components is equal or inferior to the output irradiation of the light source of the light pattern projection component.

In a development, the coding can be achieved by matching the spectra of the one or more sources to the maxima transfer rates of the filter of the image acquisition component. In a development, the system further comprises an operation of correcting the first and/or the second image by compensating the movements of the image acquisition component during the image acquisition operation. In a further development, the compensation is performed by processing data received from an accelerometer and/or a gyroscope and/or a motion sensor and/or a mechanical optical component and/or an electrical optical component and/or a camera tracking device and/or a magnetometer and/or a computer vision tracking device associated with the mobile device.

In a development, the compensation is performed by multi-view geometry methods (e.g. pose solvers using image features), or by projective warping, or by piecewise linear, projective, or higher order warping, or by deconvolution, or by oriented image sharpening, or by optical flow detection, before or after or in an iterative refinement process with the subtraction of the images. In a development, the object is food, and the system further comprises the operation of performing food recognition on the first and/or the second image to estimate one or more food types in said image and one or more volumes associated with said food types. In a development, the food recognition operation comprises one or more of the operations of segmenting the image, identifying color and/or texture features of segmented parts of the image and performing machine-learning-based classification for one or more segmented parts of the image, or a combination thereof.

In a development, the system further comprises an operation of estimating one or more nutritional characteristics of the meal captured in the first or second image by multiplying the estimated volumes of the determined food types by unitary volumetric values retrieved from a database, said database being stored remotely (e.g. on a cluster, server and/or computer network) and accessed through any available wired and wireless communication channel, and/or stored locally on the device, and/or determined from food labels by using optical character recognition (OCR) and/or associated with geolocation data and/or provided by the user. In a further development, the one or more characteristics, of the meal or of parts thereof, are one or more of carbs content, fat content, protein content, Glycemic Index (GI), Glycemic Load (GL) and/or Insulin Index (II), or a combination thereof. In a development, there is provided an insulin dose recommendation and/or a bolus profile advice based on said one or more meal characteristics. In a development, the projection of the light pattern, and in particular the whole pipeline performed by the system, is triggered by voice command or by gesture command and/or by touchscreen command and/or keyboards and/or by geofencing and/or geopositioning and/or by following a predefined time schedule. In a development, the image acquisition component and the light pattern projection component are embedded in a mobile device such as a glucometer or an insulin pump controller provided with a test strip port, or a mobile phone.
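
Purely as an illustration of the volume-times-unitary-value computation, a minimal sketch follows; the per-millilitre carb densities and the insulin-to-carb ratio are hypothetical placeholders, not values from the disclosure:

    def estimate_meal(segments, carbs_per_ml, icr_g_per_unit=10.0):
        # segments: list of (food_type, volume_ml) pairs from recognition.
        # carbs_per_ml: database lookup table, grams of carbs per ml of food.
        total_carbs = sum(carbs_per_ml.get(food, 0.0) * vol
                          for food, vol in segments)
        bolus_units = total_carbs / icr_g_per_unit  # simple carb counting
        return total_carbs, bolus_units

    table = {"mashed potatoes": 0.15, "steak": 0.0}   # illustrative values
    print(estimate_meal([("mashed potatoes", 250.0), ("steak", 150.0)], table))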

In a development, the light pattern projection component is attachable to the mobile device. Preferably the light pattern projection component is inserted, i.e. connected, to an electrical contact slot of the system, in particular a charge slot or a USB slot, preferably of a handheld device, e.g. a smartphone.

In a development, the first or the second image is a video frame and the image acquisition component is a video camera.

In a development, one or more operations are continuously, and in particular automatically, repeated until the acquisition of the first and/or the second image is considered as sufficient based on predefined thresholds associated with criteria comprising one or more of image quality, associated measurements of handshakes, time delays between still images or video frames, resulting light pattern, or a combination thereof.

In a further embodiment the image acquisition component has a sensitivity inferior to 5 lux.

There is disclosed a computer program comprising instructions for carrying out one or more of the operations according to the present disclosure when said computer program is executed on one or more suitable components. There is also disclosed a computer readable medium having encoded thereon such a computer program.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A and 1B illustrate the practical use of embodiments of the method;

FIG. 2 is a schematic and exemplary workflow of the method of volume estimation;

FIGS. 3A, 3B and 3C show examples of an analyzed object, with and without a light pattern projected on it, the resulting identification of the light pattern and the detected handshake movement;

FIGS. 4A, 4B, 4C, 4D and 4E show examples of the steps of 3D reconstruction;

FIGS. 5A, 5B, 5C and 5D show some possible variations of the light pattern structure;

FIG. 6 illustrates the handshake compensation system;

FIGS. 7A and 7B illustrate the method of extracting the light pattern from successive images;

FIGS. 8A and 8B illustrate examples of binarisation;

FIG. 9 shows the matching process between the edges of corresponding stripes on camera view and projector view;

FIG. 10 shows a curve tracking for consistent matches;

FIGS. 11A and 11B illustrate the image acquisition process;

FIGS. 12A, 12B, 12C, 12D, 12E and 12F illustrate the image segmentation step;

FIG. 13 illustrates the food recognition step;

FIGS. 14 and 15 illustrate advanced bolus computations (e.g. fat content, Glycemic Index);

FIGS. 16A, 16B, 16C, 16D and FIGS. 17A and 17B illustrate the various options of implementation in mobile devices (e.g. mobile phones, glucometers, insulin pump controllers), as extensions or accessories or as natively embedded components.

DETAILED DESCRIPTION

There is disclosed a system for estimating the volume of a food, with a mobile device. The system uses a camera and a light pattern projector. Images of the food with and without a projected light pattern on it enable computation of the tridimensional shape; image segmentation and recognition steps identify and recognise one or more food types in said images, while the volumes are computed by using the shape and the segmentation and recognition results. By applying offline or remotely accessible databases, one or more carbs values are estimated and one or more associated insulin bolus doses are provided. Developments comprise coding of the light pattern, different light sources and associated spectral properties, motion compensations, additional optics, estimation of fat content and associated multi-wave boluses. The invention can be, e.g., implemented in a glucometer or an insulin pump controller provided with a test strip port, or in a mobile phone.

There is disclosed a device attached or embedded into a mobile phone for retrieval of the three-dimensional structure of surfaces, composed of an external miniature unit to generate and/or project a low intensity light pattern or patterns on the surface, and a data processing algorithm. The device is placed on or in a mobile phone at a predetermined position, and controlled by the mobile phone. Bluetooth, the phone jack or the mobile phone flash can be used as a control channel. The control is carried out by a dedicated software application. The mobile phone software takes two photographs of the object sequentially, with and without pattern highlighting. The pattern is extracted using the resulting subtraction of images. Further, the 3D shape of the surface and the associated volume are computed. In order to estimate the carbs content, the computed shape is used along with the results of automatic or semi-automatic food segmentation and recognition, and nutritional databases.

FIGS. 1A and 1B illustrate an example of a practical use of the method, when some particular embodiments are implemented. The device according to the invention (comprising one or more image or video acquisition means and one or more projector means) is attached to the mobile phone. Alternatively, such a device is embedded in the mobile phone. In the case wherein the device is attached to the mobile phone in a fixed position, the orientation of light patterns is predefined and allows the use of simplified algorithms on mobile phones. The dedicated application can be launched on the mobile phone by the user. The mobile phone is then placed in some position oriented towards the dish. The user launches the measurements with a virtual control on the display, for example (e.g. pressing a button or activating a touch screen area). The application (“app”) triggers successive sub-steps. It first turns on the irradiation (projection of the light source). Optionally the triggering can be achieved by turning on the mobile phone flash. The triggering also can be achieved by a command transferred to the device via Bluetooth or the phone jack. The optical objective of the device forms a light source beam into a specific pattern. The shape of the pattern can be predefined (or adaptive, when successive illuminations occur for example). The projection of the light pattern covers the whole field of view of the mobile phone camera. One picture shot (image acquisition) is carried out. The application then switches off the illumination and repeats the shot without highlighting. The application then distinguishes the different food elements present in the dish, and computes the different volumes and the different associated food categories. In the end, the application proposes one or more bolus values to the user. In some embodiments, the segmented image is shown, indicating the different recognized parts and the associated carbs values. In some further embodiments, the fat content and other nutritional values are estimated as well and a multi-wave bolus can be proposed, to match the kinetics of blood sugar rise with the insulin effect in the body.

In some embodiments, the position of the mobile device can be adjusted using the screen operating as a viewfinder, with a guarantee of highlighting the chosen field of view with the light pattern. In other words, by design the viewfinder focuses the camera on the right object, the field of view of the optics being designed to cover the field of view of the camera.

In some embodiments, steps of the method are performed before the consumption of the meal and repeated after the meal is consumed. By subtracting estimated volumes, further carbs estimation can be conducted. It may well be indeed that the user does not eat the meal completely, and in such a case, subtraction operations would have to be performed. Since the food plate can be segmented into several parts (corresponding to different food categories, e.g. steak and potatoes), there can be associated estimations of several volumes, and consequently a global insulin bolus recommendation can be proposed, or several bolus values each associated with the different food parts, said insulin dose or doses corresponding to food actually consumed.

FIG. 2 is a schematic and exemplary workflow of the method of volume estimation. The preinstalled application (“app” or program) on the mobile device comprises a user interface, to allow the user to interact with the system, e.g. to start or to stop the measurements. The user keeps the mobile phone at a fixed position above the object of interest (e.g. food on a plate). Then, the application obtains automatically two shots of the object within a minimal timeframe. After the image acquisition, the program extracts the pattern using the two acquired images. This is performed by subtraction of the image which contains the light pattern from the simple image without the light pattern. The result of the subtraction is used to obtain the light pattern and its shape. Before or after or in an iterative process with the subtraction and light pattern detection, the artefacts that are induced by user hand shake are detected and corrected. Handshake generates motion blur, which can be formulated as a convolution kernel; this kernel may be automatically detected and then automatically compensated by using image denoising. Furthermore, multi-view approaches allow detecting the motion by using common features on the image sequence and computing the relative pose and orientation. Alternatively, general projective and image warping transforms can be applied using common points between the images, and thus used to map one image onto the other, removing artefacts due to motion. Optical flow is a dense equivalent of image warping, which may find a more optimal transform, and provide more accurate error compensation. Once the motion is detected it is corrected by image remapping.

Several methods can be used for 3D reconstruction.

For example the 3D reconstruction from light patterns can be performed by computing deformations of individual lines. A gradient can also be computed running along each line, and the depth of the scene can be computed by triangulation. Triangulation consists in using the known triangular connectivity of projecting points on multiple cameras to retrieve their position in space. Considering the projective geometry of points in space, the projection of a point in space is on the straight line joining this point and the camera center. When in presence of a plurality of cameras, the segments joining a point to two cameras and the segment joining the two camera centers form a triangle, and the projections of the point are on the two segments joining point and camera. Said triangulation uses the projections of a point to reconstruct the corresponding triangle, where the third corner is the point in the 3D scene. This step is applied to all correspondences between images. A cloud of points is obtained, representing the scene depicted in the images. The identification of the corresponding projections of points in the different images, called “point correspondences”, is facilitated by the use of the projected light pattern according to different embodiments of the invention. For example, using stripe patterns, finding point correspondences in multiple images is simplified, as it reduces to the task of finding corresponding stripes in multiple images. A single reference stripe is necessary to obtain matches for all points, by propagating stripe indices away from the reference (‘horizontally’) in both images simultaneously. By propagating the indices along stripes (‘vertically’), gradual occlusions within stripes can be handled. The step of propagating horizontally increases the stripe index every time a black-and-white boundary is crossed. The step of propagating stripe indices vertically keeps the same index for the points in a connected component. By choosing the minimum propagated index at each point, the right stripe matching is obtained (unless there is a collection of large objects in the foreground occluding the objects of interest, but in practice this does not happen when scanning food or, similarly, when scanning simple scenes on a flat plane).
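
The horizontal/vertical index propagation just described can be sketched as follows; this is a simplified NumPy/SciPy rendering under the assumption of a clean binary stripe image, not the literal implementation of the disclosure:

    import numpy as np
    from scipy import ndimage

    def stripe_indices(binary):
        # Horizontal pass: the stripe index increases every time a
        # black-and-white boundary is crossed along a row.
        boundaries = np.diff(binary.astype(np.int8), axis=1) != 0
        idx = np.zeros(binary.shape, dtype=np.int32)
        idx[:, 1:] = np.cumsum(boundaries, axis=1)
        # Vertical pass: all points of a connected component keep the
        # same index; the minimum propagated index handles occlusions.
        labels, n = ndimage.label(binary)
        out = np.full(binary.shape, -1, dtype=np.int32)
        for lab in range(1, n + 1):
            mask = labels == lab
            out[mask] = idx[mask].min()
        return out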

FIGS. 3A-3C show examples of an analyzed object, with and without a light pattern projected on it, the resulting identification of the light pattern and the detected handshake movement. FIG. 3A shows the object (e.g. chicken wing) with the light pattern projected on it (i.e. the bright stripes). FIG. 3B shows the object without highlighting. FIG. 3C represents the result of the subtraction between the two images, resulting in the identification of the light pattern. The movement of the camera due to the hand shake can be observed (marked on the image by an arrow).

FIGS. 4A-4E show examples of the steps of 3D reconstruction, including the subtraction of images, the noise removal step and the 3D shape computation steps. FIG. 4A and FIG. 4B show the test images with and without light pattern. FIG. 4C shows the subtraction result of the original images (A) and (B) without “hand shake”. FIG. 4D shows the processing of light patterns and making them 1 pixel thin. FIG. 4E shows the 3D model computed from the deformation of lines.

FIGS. 5A-5D show some possible variations of the light pattern structure, which are examples only. The shape of the elements of the pattern can give information about the distance to the object, while its deformation informs about the local depth. The light patterns can be obtained from the light sources directly, or from replaceable or predefined masks in optical or non-optical elements in the device attached to the mobile device or natively embedded in the mobile device. For example, the mobile phone flash can contain an optical element, an amplitude zone plate, that forms the light pattern. The structure of the pattern can vary, as shown by some examples: (FIG. 5A) stripes, (FIG. 5B) repetitive elements, (FIG. 5C) dots or even (FIG. 5D) mixtures thereof.

The 3D information extraction according to the present method is simplified over techniques analysing the reflection of light patterns on analysed surfaces. In a preferred embodiment, the 3D information is obtained by encoding the surface using static light patterns. Still, according to some further developments of the present methods, it can be possible to move or rotate the light pattern(s) during image/video acquisition (acquiring more images for example), which additional steps may enable saving binary information on stripes of the surface (and consequently deducing information on the texture and nature of the parts of the surface being analysed).

FIG. 6 illustrates the handshake compensation system. While a fixed camera (i.e. with stable conditions in space during image acquisition) leads to accurate volume measurements, some embodiments of the invention enable image acquisition in a mobility situation (i.e. with relatively unstable conditions in space during image acquisition). When the mobile device is held by a person, the hand of the user can shake while image acquisition is performed (while obtaining the snapshots of the surface).

According to some embodiments of the invention, the detected movements of the device due to handshakes can be taken into account in order to compensate these movements and in fine to reduce noise artefacts. The described method advantageously leverages the accelerometer and/or motion sensor data measured and available in and for the device. The method therefore can go beyond the handling of pure pattern lines extraction, by taking into account compensation data, namely artefacts generated by the handshake. As a result, the accuracy of the volume is increased. Without compensation, the artefact lines due to handshakes would otherwise participate in less accurate (but still valuable) three-dimensional reconstruction.

Various mechanisms or components can be used to compensate handshakes: motion sensors (for example accelerometer or gyroscope), mechanical (and/or electrical) optical elements for image stabilization. In a preferred embodiment, the existing optical elements of a mobile phone are used and handshakes are compensated by software corrections on the basis of accelerometer or motion sensor data. An accelerometer is a device that measures proper acceleration. Single- and multi-axis, e.g. 3-axis, models of accelerometer are available to detect magnitude and direction of the proper acceleration (or g-force), as a vector quantity, and can be used to sense orientation (because the direction of weight changes), coordinate acceleration (so long as it produces g-force or a change in g-force), vibration, shock, and falling in a resistive medium (a case where the proper acceleration changes, since it starts at zero, then increases). Micromachined accelerometers are increasingly present in portable electronic devices and video game controllers, to detect the position of the device or provide for game input. In commercial devices, piezoelectric, piezo-resistive and capacitive components are commonly used to convert the mechanical motion into an electrical signal. An accelerometer and/or a gyroscope can be used to compensate handshakes. An accelerometer alone can be used. A gyroscope alone can be used. A combination of an accelerometer with a gyroscope can be used. A gyroscope allows the calculation of orientation and rotation. Such components have been introduced into consumer electronics and are readily available in devices. The integration of the gyroscope has allowed for more accurate recognition of movement within a 3D space than the previous lone accelerometer within a number of smartphones. Gyroscopes in consumer electronics are frequently combined with accelerometers (acceleration sensors) for more robust direction- and motion-sensing. Other mechanisms can be used: camera tracking, magnetometers, position tracking by computer vision.

In addition, one or more additional optical elements can be used to correct handshakes or vibrations. Indeed, some noise cancelling sensors may be used. For example, further additional light pattern(s) can be introduced, and their deformed dimensions and/or the intensity of reflected light further analysed. Such additional optical elements can be implemented natively in a specific mobile device, e.g. in the remote control of an insulin pump, or can be implemented through an extension device to be connected to a standard mobile phone or remote control.

Electromechanical optical arrangements can also be used to compensate for pan and tilt (angular movement, equivalent to yaw and pitch) of a camera or other imaging device. Such stabilization is used in image-stabilized binoculars, still and video cameras, and astronomical telescopes. With still cameras, camera shake is particularly problematic at slow shutter speeds (in the dark).

With video cameras, camera shake causes visible frame-to-frame jitter in the recorded video. In video embodiments, real-time digital image stabilization can be used, by shifting the electronic image from frame to frame of video, enough to counteract the motion. Such a technique uses pixels outside the border of the visible frame to provide a buffer for the motion. This technique reduces distracting vibrations from videos or improves still image quality by allowing one to increase the exposure time without blurring the image. This technique does not affect the noise level of the image, except in the extreme borders when the image is extrapolated.

Further embodiments for handshake compensation are now discussed. According to another embodiment, handshake and/or hand movements during measurement can be approximated by a homography (a projective transformation from one image to the other). When changing the point of view in a 3D scene, image deformations of objects are projective for planes and non-projective for all non-planar objects. However, if the movement of the camera is small, non-projective transformations in the image are also small in amplitude and can be approximated by projective transformations. Furthermore, the requirement is not an exact match between pixels but a near-exact match in colors and neighborhoods.
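
A minimal OpenCV sketch of this homography approximation, assuming enough pattern-free features can be matched between the two shots; ORB features and the RANSAC threshold are implementation choices, not prescribed by the disclosure:

    import cv2
    import numpy as np

    def homography_compensate(img_ref, img_moving):
        # Match features between the two shots to estimate the homography.
        orb = cv2.ORB_create(1000)
        k1, d1 = orb.detectAndCompute(img_ref, None)
        k2, d2 = orb.detectAndCompute(img_moving, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]
        dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        # Warp the moving image onto the reference to undo the hand movement.
        h, w = img_ref.shape[:2]
        return cv2.warpPerspective(img_moving, H, (w, h))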

In another embodiment, image warping is used for correcting shakes between images. Warping is done by finding sparse pixel correspondences between two images, tessellating the images (triangulation, quadrangulation, etc.), and applying local transformations to each element to map them to their corresponding element in the second image.

In another embodiment, image features are extracted from the acquired images, and the correspondences between features are estimated and used to compute the relative camera poses before and after the handshake (with a 5-, 7- or 8-point algorithm). Once all the images are rectified, transformations (e.g. shift, rotation, zoom) are applied to the rectified images to fit each element from one image to the corresponding element of the other.

In another embodiment, optical flow is used. In such a development, each pixel in one image is mapped to the corresponding pixel in another image. The motion can be detected using global methods (phase correlation approach), local methods (e.g. sum of squared differences, sum of absolute differences) or differential methods (Lucas-Kanade, Horn-Schunck methods, etc.).
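
As one concrete rendering of this dense mapping, a sketch using Farneback's differential method as available in OpenCV; the parameters are typical defaults, not values from the disclosure:

    import cv2
    import numpy as np

    def flow_remap(img_ref, img_moving):
        g1 = cv2.cvtColor(img_ref, cv2.COLOR_BGR2GRAY)
        g2 = cv2.cvtColor(img_moving, cv2.COLOR_BGR2GRAY)
        # Dense flow: per-pixel displacement from the reference towards
        # the moving image.
        flow = cv2.calcOpticalFlowFarneback(g1, g2, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = g1.shape
        gx, gy = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (gx + flow[..., 0]).astype(np.float32)
        map_y = (gy + flow[..., 1]).astype(np.float32)
        # Remapping undoes the detected motion (the image remapping step).
        return cv2.remap(img_moving, map_x, map_y, cv2.INTER_LINEAR)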

In another embodiment, one or more steps of de-blurring are applied. For example, inverse convolution with a linear shift invariant filter is applied to an image to increase details. The choice of the one or more filters depends on the application. In a preferred embodiment, the filter can be made of spatial translation (motion blur) and of a randomized component (shake).

It is to be noted that these methods of compensation can be combined, and they leverage or benefit from the existing internal hardware of mobile phones. For example, the path (in space) taken by the mobile phone during the course of the measurement is a linear multiple of time plus the double integral of accelerometer measurements over time. The series of gyroscopic measurements indicates the orientation of the device at all points in time. Such information can allow the computation of the change in viewpoint. This in turn facilitates the application of the different embodiments or methods previously described.

ΔP, the difference in positions, is defined as:

$$\Delta P(t_1, t_0) = v(t_0)\,(t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\,dt^2$$

Where:

-   v(t₀) is the original speed of the device; it is unknown and assumed to be 0,
-   a(t) is the acceleration vector at a given time; it can be obtained from inbuilt accelerometers,
-   t₁−t₀ is the time between two image captures, as can be obtained from the system clock.

Orientation can be obtained by:

-   gyroscopic measurements, if available;
-   integrating rotational accelerations from accelerometer data.

This data can be retrieved from 3 distinct accelerometer data streams, and the knowledge of their placement in the device.
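
A numerical sketch of the ΔP formula above, using trapezoidal double integration of a 3-axis accelerometer stream; the sampling rate and sample values are illustrative assumptions:

    import numpy as np

    def position_delta(accel, dt):
        # v(t0) is unknown and assumed to be 0, as stated above.
        a = np.asarray(accel, dtype=float)          # (N, 3) samples, gravity removed
        v = np.vstack([np.zeros(3),
                       np.cumsum((a[:-1] + a[1:]) * 0.5 * dt, axis=0)])
        # Second (trapezoidal) integration yields the position change.
        return np.sum((v[:-1] + v[1:]) * 0.5 * dt, axis=0)

    rng = np.random.default_rng(0)
    shake = rng.normal(0.0, 0.02, size=(30, 3))     # 30 samples at 100 Hz
    print(position_delta(shake, dt=0.01))           # ΔP in metres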

FIGS. 7A and 7B illustrate the method of extracting the light pattern from successive images. The figure shows the image acquisition procedure, which consists of a sequence of acquisition of two images: (A) one highlighted with a light pattern, and (B) one normal, i.e. without the projected light pattern. In FIG. 7A, the external light source is attached to a mobile phone device (or is an integral part thereof). It generates a coded light pattern. The pattern is projected on the investigated surface (in the present case on served food). Coding can be performed by spatial, phase and/or amplitude modulation, i.e. intensity or colour change. The highlighted area should at least correspond to the mobile phone camera field of view (FOV). This requirement implies that the pattern safely covers the object of interest and partly covers the area of the table around it. The user then launches the specialised application (pre)installed on his/her mobile phone and holds the phone in a position and angle that ensure that the entire meal is within the camera's FOV; e.g. the mobile phone can be placed parallel to the table, at a distance of about 30 cm. The application triggers the external light source (pattern projector) via the mobile phone flash, Bluetooth, phone jack, etc. An image of the dish highlighted by the pattern is acquired. In FIG. 7B the application turns off the external light source and another picture is taken. The two images should be acquired with minimum delay, in order to minimize camera shake and motion blur. Motion correction can be performed using deconvolution methods (computationally expensive) or oriented image sharpening.

Once image acquisition is completed, a high contrast image of the light pattern is extracted. This operation is achieved by using pixel-wise subtraction of the two images:

$$I_{sub}(x,y) = I_{lp}(x,y) - I(x,y)$$

where x and y are the coordinates of pixels, and I_sub(x,y) is the resultant intensity at coordinates (x,y). I_lp(x,y) and I(x,y) correspond to the colour intensities of pixels in the image that contains the light pattern and the image without the light pattern, respectively. Subtraction can be performed for all colour channels, or only for one if the laser source wavelength is known and unique. The next step is edge detection using binarisation and/or other methods.
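
A direct rendering of this subtraction with OpenCV follows; the saturating cv2.subtract avoids negative values, and selecting the green channel for a green laser source is an illustrative assumption:

    import cv2

    def extract_pattern(img_with_lp, img_without):
        # I_sub = I_lp - I, computed per colour channel with saturation.
        i_sub = cv2.subtract(img_with_lp, img_without)
        # If the source wavelength is known and unique, a single channel
        # suffices (here: green, index 1 in OpenCV's BGR ordering).
        return i_sub, i_sub[..., 1]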

FIGS. 8A and 8B show examples of binarisation: FIG. 8A shows a noisy captured stripe pattern and FIG. 8B shows the resulting binary flags after threshold selection (black—0, white—1). To reinforce the edges in I_sub before detection, several methods can be applied, e.g. simple binarisation (conversion to black-white) of I_sub given a fixed threshold:

$$I_{bin}(x,y) = \begin{cases} 1, & \text{if } I_{sub}(x,y) > threshold \\ 0, & \text{if } I_{sub}(x,y) \le threshold \end{cases}$$

where I_bin(x,y) is the binary value of the pixel with coordinates x and y. The result of such an operation is the reflected light pattern (FIG. 8B). The threshold value can be determined adaptively from statistical analysis of the image (mean, median, histogram search). Pixel jitter can be removed by selecting the modal binary flag over a small neighbourhood (no more than a stripe's width). Edges are extracted from a binary image by keeping only pixels of one flag with a direct neighbour of the other flag (e.g. a black pixel with a white neighbour or a white pixel with a black neighbour). Without binarisation, edges are extracted using the Canny or Sobel operators, phase coherence methods, or wavelets.
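
These thresholding, jitter-removal and edge-extraction steps can be sketched as follows; Otsu's method stands in for the statistical threshold search, a 3×3 median filter acts as the modal filter on a binary image, and border wrap-around is ignored in this sketch:

    import cv2
    import numpy as np

    def binarise_and_edges(i_sub_gray):
        # Adaptive threshold from image statistics (Otsu's method);
        # i_sub_gray is assumed to be a single-channel 8-bit image.
        _, binary = cv2.threshold(i_sub_gray, 0, 1,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Median of a binary image = modal flag over the neighbourhood.
        binary = cv2.medianBlur(binary.astype(np.uint8), 3)
        # Edge pixels: one flag with a direct neighbour of the other flag.
        up, down = np.roll(binary, -1, 0), np.roll(binary, 1, 0)
        left, right = np.roll(binary, -1, 1), np.roll(binary, 1, 1)
        edges = (binary != up) | (binary != down) | \
                (binary != left) | (binary != right)
        return binary, edges.astype(np.uint8)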

Coding of the light pattern is now discussed.

In a development, amplitude and phase modulation of the highlighting irradiation can be used, in order to achieve better recognition of light patterns on the images.

In embodiments where the light pattern is coded, two further developments can be used. The first is to use a De Bruijn code (color coding the stripes) and the second one is to use time coding. Polarization, phase modulation and amplitude modulation also can be used. These developments allow the minimization of occlusions that may appear in image acquisitions. They can be used in combination.

FIG. 9 shows the matching process between the edges of corresponding stripes on camera view and projector view in a parallel stereo setup. The resulting image (camera view) is scanned by a vertically-moving horizontal line. In addition, FIG. 9 presents the correspondences between the camera and the projector views which are detected.

FIG. 10 shows an optional step of curve tracking for consistent matches. The curve is reconstructed by following the (oriented) normal to the gradient (or by going through connected neighbourhoods). Consistent matching can be obtained by tracking the curves from one end of the image to the other. Alternatively, a propagation method can be used.

FIGS. 11A and 11B illustrate the design of the structured light setup comprising an imaging sensor (camera), a structured-light projector and the object of interest. A light pattern is projected (via the projector) onto the object of interest, while the camera acquires the corresponding object image (highlighted by the projected pattern). Depth information can be obtained by measuring the distortion between the captured image and the reflected image. The most simple and commonly used pattern is a sequence of stripes, while other coded light patterns are applicable. It has to be noted that the proposed solution refers to a system of known camera-projector configuration. FIG. 11A shows the depth estimation of a 3D scene point (SP) knowing the projections in each image (P1 and P2), as well as the centers of the camera (CC) and the projector (PC). Knowing the system configuration (relative translation and orientation between camera and projector, internal projector parameters and internal camera parameters), as well as the already detected correspondences, the depth/relative position of a 3D point (SP) can be estimated. Specifically: i) the detected correspondences specify the projections of the 3D points to each of the images (P1, P2); ii) the projection of the SP to the camera image is on the line joining the point and the camera center (CC). The same applies to the projection of the 3D point to the projector image. Those two lines intersect by definition at SP; iii) knowing the CC and PC, as well as the two projections of the 3D point (P1 and P2), the depth of the SP can be estimated.
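
For the rectified, parallel camera-projector case of the figures, the depth estimate reduces to the classic disparity relation Z = f·B/(x_cam − x_proj); a small sketch, with the focal length and baseline values being illustrative assumptions:

    import numpy as np

    def depth_from_disparity(x_cam, x_proj, focal_px, baseline_m):
        # Parallel setup: the disparity between a point's camera column
        # (P1) and its projector column (P2) encodes its depth.
        disparity = np.asarray(x_cam, float) - np.asarray(x_proj, float)
        return focal_px * baseline_m / disparity

    # Assumed values: 800 px focal length, 6 cm camera-projector baseline.
    print(depth_from_disparity([120.0, 125.0], [100.0, 110.0], 800.0, 0.06))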

According to one embodiment, a single image can be acquired (a single image of the object along with the projected light pattern on it). In such a case, a threshold filter is applied and local minima and maxima are determined. Such a method implies both a powerful light source (for achieving high contrast) and more computations (computing power). According to some other embodiments, a plurality of images is acquired. An optional phase modulation can be applied to highlight one or more fringes of the light pattern area.

In a preferred embodiment, two images are successively acquired and the images are subtracted to identify the deformed light pattern. This solution is advantageous in some circumstances. This process enables calculating the local reflectance of the considered object with a better accuracy. A less powerful source of light can be used (correspondingly decreasing the level of danger associated with laser power). The associated methods and systems are also more robust to changes in external illumination conditions (ambient light), which can change significantly (also over time). Food elements often exhibit a variety of light reflection properties, implying the benefit of an adaptive threshold estimation with respect to the local level of reflected light, as is the case when acquiring two successive images.

FIGS. 12A-12F illustrate the image segmentation step. Different state-of-the-art image segmentation techniques can be used. In one embodiment, the segmentation algorithm consists of three steps: mean-shift filtering, region growing and region merging. FIGS. 12A-12F show several steps of the segmentation-recognition step: (a) the original image, (b) the smoothed images after mean-shift filtering, (c) the region growing result, (d) image segments after merging. First the original image (FIG. 12A) is converted to the CIELAB color space, which is considered perceptually uniform. Then mean-shift filtering is applied in a pyramidal way (FIG. 12B). A Gaussian pyramid is built, the filtering is applied to the smaller scale and then the results are propagated to the larger scales. Then, a region growing method detects the initial segments based on their color homogeneity (FIG. 12C), followed by a region merging stage where segments with size smaller than a threshold are merged to the closest neighbor in terms of average color (FIG. 12D). Finally the plate is detected after detecting the image edges and combining them to get the best possible ellipse with the RANSAC algorithm (FIG. 12E). Segments with a large part outside the plate or sharing a big part of their border with the plate are eliminated. At the end of the segmentation stage, some of the food items might still be over-segmented to some extent. However, if the generated regions are large enough to be recognized, the final segmentation result will be further improved after merging regions with the same food label (FIG. 12F).
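
A compressed sketch of this stage using OpenCV's pyramidal mean-shift filter, with a simple colour clustering standing in for the region growing/merging steps; sp, sr and the segment count k are assumed parameters:

    import cv2
    import numpy as np

    def segment_meal(bgr_image, k=6):
        # Pyramidal mean-shift filtering (spatial radius sp, colour radius sr).
        smoothed = cv2.pyrMeanShiftFiltering(bgr_image, sp=21, sr=30, maxLevel=2)
        # Cluster in CIELAB, which is approximately perceptually uniform.
        lab = cv2.cvtColor(smoothed, cv2.COLOR_BGR2LAB)
        data = lab.reshape(-1, 3).astype(np.float32)
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
        _, labels, _ = cv2.kmeans(data, k, None, criteria, 5,
                                  cv2.KMEANS_PP_CENTERS)
        return labels.reshape(bgr_image.shape[:2])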

FIG. 13 illustrates the food recognition step. After segmenting the meal image, each of the created segments has to be recognized so that a food label can be assigned to it. The food image recognition comprises two steps: image description (a set of characteristics describing the visual content of the image is extracted and quantified) and image classification (one or more classifiers assign to the image one or more classes out of a pre-defined set of food classes). Both steps require training (the system learns from the acquired knowledge) and testing (the system recognizes food types from new, unknown images). Each of the image segments produced by the previous stage is described using colour and texture features and then classified by a machine-learning based classifier into one of the pre-defined food classes. The classifier has previously been trained on a large training dataset of image patches belonging to the considered food classes. Support Vector Machines (SVM) constitute the most popular classification solution, while nearest neighbor, probabilistic approaches and artificial neural networks can also be used.

In a preferred embodiment, both color and texture features can be used and then classified by a machine learning-based classifier into one of the pre-defined food classes. The classifier has previously been trained on a large training dataset of images belonging to the considered food classes. The histogram of a pre-clustered color space can be used as the color feature set. A hierarchical version of the k-means algorithm can be applied to cluster the color space created by the training set of food images, so that the most dominant food colors are determined. The use of the hierarchical k-means instead of the original k-means provides efficiency during the calculation of features, since it creates a tree of hierarchical clusters with a branch factor of 2. The initial color space is split into 2 clusters and each of them is iteratively split in two until the required number of colors is reached. In some cases, a set of 512 colors can be considered as sufficiently descriptive. After clustering the image colors, their histogram is created and every histogram value is treated as a feature. For texture features, the LBP operator can be used. The LBP operator is a non-parametric operator measuring the local contrast on grey scale images for efficient texture classification. The LBP operator consists of a 3×3 kernel where the center pixel is used as a threshold. Then the eight binarized neighbors are multiplied by the respective binomial weight, producing an integer in the range [0 . . . 255]. Each of the 256 different 8-bit integers is considered to represent a unique texture pattern. Thus, the LBP histogram values of an image region describe its texture structure. Hence, a color and texture feature vector of 512+256=768 dimensions is created and fed to the classifier that will assign to the segment one of the predefined food classes.
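
The 768-dimensional descriptor can be sketched as follows; uniform colour quantisation (8 levels per channel, 512 bins) stands in for the hierarchical k-means clustering, while the 3×3 LBP histogram follows the definition above. The resulting vectors would then feed an SVM classifier (e.g. sklearn.svm.SVC):

    import cv2
    import numpy as np

    def lbp_histogram(gray):
        # 3x3 LBP: threshold the eight neighbours against the centre pixel
        # and weight them by powers of two, giving a code in [0..255].
        g = gray.astype(np.int16)
        c = g[1:-1, 1:-1]
        code = np.zeros_like(c)
        shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                  (1, 1), (1, 0), (1, -1), (0, -1)]
        for bit, (dy, dx) in enumerate(shifts):
            neigh = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
            code |= (neigh >= c).astype(np.int16) << bit
        hist, _ = np.histogram(code, bins=256, range=(0, 256))
        return hist / max(hist.sum(), 1)

    def describe_segment(bgr_segment):
        # 512-bin colour histogram (8 quantised levels per channel).
        q = (bgr_segment // 32).astype(np.int32)
        idx = q[..., 0] * 64 + q[..., 1] * 8 + q[..., 2]
        col_hist, _ = np.histogram(idx, bins=512, range=(0, 512))
        gray = cv2.cvtColor(bgr_segment, cv2.COLOR_BGR2GRAY)
        return np.concatenate([col_hist / max(col_hist.sum(), 1),
                               lbp_histogram(gray)])   # 768-D vector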

FIGS. 14 and 15 illustrate advanced bolus computations (e.g. fat content, Glycaemic Index). FIG. 14 shows an example of food labelling. Such data is generally available and can be retrieved with a network connection or stored offline. Such data can be accessed by means of 2D bar code scanning. In other words, the image acquisition component can be used to capture the image of the meal, to get direct information on the food about to be consumed, but also can serve indirect purposes such as retrieving data or metadata on food, via 2D bar code scanning (or others, like QR codes for example).

As described, one of the purposes of the camera is to capture images of the food about to be consumed, and to use advanced algorithms to process the images to identify both the type and the quantity of the food. In the presence of an additional blood glucose monitor device or of a continuous glucose monitor device, for example, further additional developments of the invention are enabled. By image recognition and/or by the use of such devices, further meal characteristics can be obtained, and these parameters can substantially help to tailor boluses to achieve better glycaemic control.

For example, after types of food have been identified (i.e., bread, pasta, potatoes, etc.), estimates of other aspects of the meal such as the fat and protein content can be made. This information could be obtained from publicly available databases, or also from a scan of the nutrition chart of the particular meal. The camera can be used to OCR the available nutritional label, if any, or data can be retrieved using RFID or (2D) bar code scanning or the like. From the fat and protein information there is the possibility of determining whether these meals are slow or fast. The terms “slow” and “fast” refer to the speed with which food causes the blood glucose value to rise and the duration over which glucose continues to go into the bloodstream. A slow meal is one that has a much longer time to attain peak glucose value as compared to a normal (lean) meal, so that its time of overall action is much greater than the standard 4 hours. Conversely, a fast meal has a much faster peak glucose value and its time of action is much shorter than 4 hours. A schematic of meal speeds is provided. It is important to balance the insulin action profile with the meal activity profile to ensure proper control. A standard insulin bolus administered at the beginning of the meal, or a few minutes before the meal, is designed to handle the fast blood glucose rise caused by carbs in a fast meal. If after four hours the blood glucose is continuing to rise and the rise continues for six hours, then the meal would be classified as a slow meal. In a slow meal the blood glucose rise in the first hour would be relatively modest compared to that of a fast meal. Therefore for a slow meal, a standard insulin bolus may provide too much blood glucose reduction in the first two hours, perhaps causing hypoglycaemia, but not enough reduction in the 4 to 6 hour timeframe, leading to hyperglycaemia several hours after the meal.
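
Purely as an illustration of how a fat-rich ("slow") meal could shift part of a bolus into an extended phase, a minimal sketch follows; the fat threshold, the 60/40 split, the extension duration and the insulin-to-carb ratio are arbitrary placeholders, not clinical guidance from the disclosure:

    def bolus_profile(carbs_g, fat_g, icr_g_per_unit=10.0, fat_slow_g=20.0):
        # Total dose from carb counting (patient-specific ICR assumed).
        total = carbs_g / icr_g_per_unit
        if fat_g > fat_slow_g:
            # Slow meal: multi-wave bolus, part delivered immediately and
            # part extended to match the delayed glucose rise.
            return {"immediate": 0.6 * total, "extended": 0.4 * total,
                    "extended_over_h": 4.0}
        return {"immediate": total, "extended": 0.0, "extended_over_h": 0.0}

    print(bolus_profile(carbs_g=60.0, fat_g=35.0))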

By using the image of the meal, the system can recognize that this meal is “the same” or “substantially similar” to prior meals that the user has eaten. If the pre- and post-meal blood glucose values are available for the prior experiences with this meal, and there is a clear trend of deviation from desired glycaemic control, then adjustments for the present meal dosing and insulin can confidently be made to get an improved glycaemic response. Along these lines, the system can monitor the relative sizes of these same meals and develop a histogram of the variation in meal size. It is known that the size variations usually fall into a small discrete number of buckets (U.S. Pat. No. 7,941,200 B2). If the system detects that such is the case, then the user's specific customized bolus sizes and delivery types could evolve from observing the meal consumption, insulin doses given and glycaemic responses of the user when consuming these “same” meals.

Monitoring the blood glucose values at 2 and 4 hours after the consumption of the meal can give a clear idea about the speed and the carbs content of the meal. Alternatively, if the system includes a continuous glucose monitoring device, then the uncertainty with respect to pre- and post-meal responses could be removed, as a much higher fidelity measurement of the glucose profile is available.

An additional benefit of using this intensive approach is to determine whether the patient's therapy parameters need to be altered. If it turns out that the patient is having difficulty controlling glucose excursions only after specific meals, then it would suggest that the bolus determination needs to be addressed. If, however, the patient is having consistent post-prandial excursions, as verified by glucose profiles of all meals, then it would suggest that the patient therapy parameters also need to be altered.

Further embodiments of the method handle parameters such as Glycaemic Index (GI) and/or Glycaemic Load (GL) and/or Insulin Index (II). Such values can be associated with a meal consumed by the patient. The glycaemic index (GI) provides a measure of how quickly blood sugar levels (i.e., levels of glucose in the blood) rise after eating a particular type of food. A related measure, the glycaemic load (GL), multiplies the glycaemic index of the food in question by the carbs content of the actual serving. The Insulin Index is a measure used to quantify the typical insulin response to various foods.

These GI, GL or II values can be given (e.g. by the restaurant or the labelling) or can be computed or estimated by image matching or similarity search. Such values can also be directly entered by the patient, who can be invited to enter more or less meal-related information (for example, a meal speed value, corresponding to the speed at which the meal is consumed, a total glycemic index of the meal, meal size in terms of fat content, carbs content, protein content, etc.). Such data also can be reconstructed from image or video analysis (for example meal speed can be derived from the video). The querying process can be configured to require a patient to enter absolute estimates (e.g., “small”) or relative terms (e.g. “smaller than normal”).

In some embodiments, steps of the method can be performed before the consumption of the meal and repeated after the meal is consumed. By subtracting estimated volumes, further carbs estimation can be handled. It may well be indeed that the user does not eat the meal completely, and in such a case, subtraction operations would have to be performed. Since the food plate can be segmented into several parts (several food categories, e.g. steak and potatoes), there can be associated estimations of several volumes, and consequently an insulin bolus recommendation can be proposed, the dose of which corresponds to the food actually consumed.

FIGS. 16A-16D and 17A-17B illustrate the various options of implementation in mobile devices (e.g. mobile phones, glucometers, insulin pump controllers), as extensions or accessories or as natively embedded components. Embodiments of the invention can be implemented in various devices, in particular in medical (regulated) devices (e.g. glucometers or insulin pump controllers) or in standard and mass-market consumer electronics devices (mobile phones, tablets, computers). In a preferred embodiment, the implementation occurs in a glucometer. Such a glucometer can be provided with a light source with sufficient power and an adapted energy management profile. In such a case, for example, a second dedicated battery can be provided as a source of energy for light illumination. Alternatively a dynamo system can replace or supplement the first and/or second battery to render the system more autonomous. The glucometer can further be adapted to control an insulin pump. A patient thus has a convenient solution to measure blood glucose, control the pump, estimate the volume and carbs value of food, obtain bolus advice and deliver insulin.

FIG. 16A shows the back view of a mobile phone with an attached device according to the invention: 1) represents the objective of the light source, 2) represents the aperture for the mobile phone camera, 3) the light source unit. FIG. 16B shows the front view of the mobile phone with the attached device: 4) represents the clips for attaching the device and fixing it on the mobile phone. FIG. 16C shows the scheme for attaching the device to the mobile phone: 1) represents the aperture for the camera, 2) represents the photodiode of the device.

The light source of the device can be developed based on a low intensity semiconductor laser irradiation source (FIG. 16A, 3). For example, the source can include one or several laser diodes. In case several diodes are implemented, the irradiation of each diode is mixed with the irradiation of the others and the integral output light is received by the objective (FIG. 16A, 1). In this case the irradiation wavelengths of the several diodes can be different. As an example, three laser diodes (commercially available lasers at low cost) with the wavelengths of 445 nm, 530 nm and 810 nm can be used. It is known that the CCD cameras of all portable devices are shielded by a set of interference filters in order to reduce the noise in the RGB channels. One can choose the wavelengths of the laser diodes corresponding to the maxima transfer rates of these filters. The three wavelengths mentioned above cover the spectral response characteristics of the filters for most handheld CCD cameras. One advantage of this technique is that it allows decreasing the power of each source compared to the scheme with only one diode. Highlighting in different spectral ranges and integrating the corresponding signals from each RGB channel can allow obtaining a high contrast of light patterns. The approach may demand more complicated optics for the objective (FIG. 16A, 1), but this might allow lower power consumption (and smaller batteries can be used). There are several solutions for wavelength mixing using a number of monochromatic light sources. For example, this can be achieved by directing the laser beam to an inner reflection prism which is installed in front of the objective (FIG. 16A, 1). The prism reflects all the light in the direction of the objective and the investigated surface (e.g. food on a plate). One alternative to the prism is to use fiber multi-wave mixing. The entrance face is attached to the diodes, while the exit face is attached to the objective (FIG. 16A, 1).

Instead of laser sources, one or more light-emitting diodes (LEDs) can be used. Modern LEDs are available across the visible, ultraviolet and infrared wavelengths, with very high brightness. The wavelength band of an LED can be wider than that of a laser diode. Red, blue and green LEDs (or any LEDs in the spectral range of camera sensitivity) are advantageously used in embodiments of the invention.

Besides these LEDs of specific wavelengths, white light LEDs can be used (this kind of LED is typically implemented in mobile phone flashes). White LEDs emit a so-called pseudo-white colour. The irradiation spectrum of such light sources is not continuous but discrete: it is composed of a mix of blue, red and green wavelength bands, which renders the device more compact compared to the previously described laser diode solution. In further embodiments, for better performance, additional optics can be added to compensate for the diffused output of the LEDs.

Another option is to use organic LEDs (OLEDs) instead of LEDs. Similarly, either an OLED of a specific wavelength or a white light OLED can be implemented, according to different embodiments of the invention.

Another option is to use simple and traditional white light sources like miniaturized halogen lamps (which demand more electrical power). In such an embodiment, a second battery can be used.

The light source power supply of the different described elements of the projector comprises one or more batteries and one or more voltage-to-current converters. For example, the laser diodes of the light sources according to different embodiments of the invention can be supplied with 3 button cells. Calculations indicate that, when using the device three times per day (for example before each meal), the operation time on one set of batteries can last up to 10 to 15 days. In some embodiments, the whole power supply can be placed in the same unit as the laser sources and objective (FIG. 16A, 1/3). It can also be a removable or a releasable or an attachable battery.
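
A purely illustrative back-of-the-envelope check of this order of magnitude follows; none of the numbers come from the specification (the button-cell capacity, drive current and session duration are all assumptions):

```python
# Illustrative battery-life estimate. Assumed figures: three button
# cells in series share one ~150 mAh charge capacity; each meal
# measurement drives the laser diodes and control electronics for
# about two minutes at an average of ~100 mA.
capacity_mah = 150.0          # assumed button-cell capacity
session_current_ma = 100.0    # assumed average draw per measurement
session_minutes = 2.0         # assumed duration of one measurement
sessions_per_day = 3          # one measurement before each meal

daily_mah = sessions_per_day * session_current_ma * session_minutes / 60.0
print(f"Estimated operation time: {capacity_mah / daily_mah:.0f} days")
# -> roughly 15 days, consistent with the 10-15 day figure above
```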

In one embodiment, ultra-low power electronics are used, especially microcontrollers which control the operation of the external light source electrical components. For example, mixed-signal microcontrollers (e.g. MSP) can be used. Such boards make it possible to keep the device in a hot mode (analogous to a sleeping mode for computers) while consuming extremely low current.

In another embodiment, special electrical and/or electronic switches can be used. These disconnect the components of the light pattern source from the power supply during standby mode.

Long life rechargeable batteries with high energy capacity can also be used.

The objective (FIG. 16A, 1) of the device forms the light pattern and projects it onto the investigated surface (e.g. food). The projection covers the whole field of view (or field of vision, FOV) of the mobile phone camera. In one embodiment, the optical part of the objective may consist of diffraction elements that form the light pattern: for example, this optical part can be a zone plate (amplitude or phase, depending on the type of light source being implemented), and additional lenses can be mounted in order to achieve acceptable light pattern projection characteristics.
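
For illustration, the zone boundaries of such a zone plate follow the standard relation r_n = sqrt(n·λ·f + (n·λ/2)²), where the second term is negligible when the focal length greatly exceeds the wavelength. The sketch below evaluates it for hypothetical design values (530 nm source, 30 cm working distance) that are not taken from the specification:

```python
import math

def zone_plate_radii(wavelength_m, focal_length_m, zones):
    """Radii of the first `zones` Fresnel zone boundaries."""
    return [math.sqrt(n * wavelength_m * focal_length_m
                      + (n * wavelength_m / 2.0) ** 2)
            for n in range(1, zones + 1)]

# Example: a 530 nm source focused at 30 cm (hypothetical values)
for n, r in enumerate(zone_plate_radii(530e-9, 0.30, 5), start=1):
    print(f"zone {n}: r = {r * 1e3:.3f} mm")  # zone 1: r = 0.399 mm
```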

The position of the device on the mobile phone or mobile device can be variable from one measurement to another. In such a case, the aperture of the light pattern projection can exceed the aperture of the mobile phone camera objective, in order to cover the field of vision of the mobile phone camera in all possible positions of the attached external light source projector on the mobile device.
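
A minimal sketch of the underlying geometry, under the simplifying assumption of a purely lateral mounting offset (the numeric values are hypothetical, not from the specification):

```python
import math

def required_projector_fov(camera_fov_deg, max_offset_m, min_distance_m):
    """Projector field of view needed to cover the camera's field of
    view when the attachment position can shift sideways by up to
    `max_offset_m` (simple small-offset geometry)."""
    margin_deg = math.degrees(math.atan2(max_offset_m, min_distance_m))
    return camera_fov_deg + 2.0 * margin_deg

# Example: 60 deg camera FOV, up to 2 cm of mounting play, food at >= 25 cm
print(f"{required_projector_fov(60.0, 0.02, 0.25):.1f} deg")  # ~69.1 deg
```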

Alternatively, the position of the device on the mobile phone can be determined or fixed from one measurement to another. In such an embodiment, the aperture of the light pattern projection can be practically the same as the aperture of the mobile phone camera. This embodiment allows forwarding all the light from the external light source to the field of vision of the camera of the mobile phone. In the case of the fixed position, the external device is attached by matching at least some external parts of the mobile phone with at least some external parts of the device according to certain designs. For example, such a position can be achieved by matching the photodiode (resp. camera aperture) of the device with the electronic flash (resp. camera objective) of the mobile phone, as shown in FIG. 16C. In the fixed position of the device, the photodiode (FIG. 16C, 2) triggers the illumination of the external light source depending on certain mobile phone camera events.

FIG. 16D shows the mobile phone with (i) and without (ii) the external passive optical element. In FIG. 16D, 1 represents the output passive optical element that forms the light from the mobile phone flash into a pattern of predefined shape, 2 the mobile phone flash, and 3 the mobile phone camera aperture. In this embodiment, the device comprises an aperture for the mobile phone camera (FIG. 16D, 3), an aperture for the mobile phone electronic flash (FIG. 16D, 2) and the output optical element (FIG. 16D, 1). The light source of the mobile phone flash is a white light LED. This gives the opportunity to use it for structured light 3D reconstruction. In this case the device is simply the passive (or active) optical element that is matched with the mobile phone flash. This optical element forms the light of the flash into a pattern of predefined shape, in analogy to the previous description. As such an element, a phase or amplitude zone plate can be used. Besides, an additional lens for concentrating the light from the flash can be installed before or after the zone plate (FIG. 16D, 1). Additional optical elements can be added here to project the light pattern into the field of view of the mobile camera. In some embodiments, the aperture of the mobile phone camera may not have any optical elements. Alternatively, the device may have a variable position on the mobile phone from one measurement to another. According to another embodiment, it can be placed on the mobile phone at a fixed position each time. This is achieved by matching features between the frames of the mobile phone and the external device.

The orientation of the light pattern structure inside the field of vision of the phone's camera can be predefined or left unfixed. In both cases different algorithms are applied: in the case of a fixed orientation, the scans are carried out in the direction of the pattern orientation, while in the case of an unknown orientation, the scans are carried out in all directions.
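
A minimal sketch of the two cases, assuming SciPy is available and using image rotation as a stand-in for directional scanning (the specification does not prescribe a particular scanning implementation):

```python
import numpy as np
from scipy.ndimage import rotate  # assumes SciPy is available

def scan_profile(diff_image, angle_deg):
    """Scan the pattern difference image perpendicular to stripes
    oriented at `angle_deg`: rotate so the stripes become horizontal,
    then average each row (a simplified 1-D scan)."""
    aligned = rotate(diff_image, angle_deg, reshape=False, order=1)
    return aligned.mean(axis=1)

def best_orientation(diff_image, angles=range(0, 180, 10)):
    """Unknown pattern orientation: scan in all candidate directions
    and keep the angle whose profile varies the most (strongest
    stripe contrast)."""
    return max(angles, key=lambda a: scan_profile(diff_image, a).var())
```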

Embodiments of the invention also enable the use of one single source of light, instead of the two sources of light projecting specific light patterns that can be observed in known industrial systems. In particular, this single source of light can be provided as an external device with a laser irradiation unit or, in a simpler embodiment, by reusing an existing mobile phone flash.

In some embodiments of the invention, powerful light sources (e.g. a laser) can be used to provide sufficiently high contrast in the reflected light, in order to detect the deformed light pattern or pattern zone. In other, preferred embodiments, less powerful light sources are used: the dimensions of the illumination unit are decreased and the energy management is optimized (this is of particular advantage if embodiments of the method are implemented in a safety-critical device, for example in the remote control of an insulin pump).

In some embodiments, external laser light sources are used. Alternatively, LED sources and/or the embedded flash light of a mobile phone can be used to project the light pattern onto the object of interest (e.g. a plate with a meal).

FIG. 17A and FIG. 17B illustrate other implementations of the steps according to the invention. One or more described steps and/or operations and/or the described components of the system (projector or illumination, camera, video camera etc) can be implemented in various ways. The components can be in physically separated devices, for example. Alternatively, the different components can be embedded in one unique device, for example a mobile phone. One or more components can also be implemented as extension or add-on or accessory or mountable devices. For example, the projector or light pattern projection component can be releasably attached or rigidly secured to a standard (unmodified) mobile phone. Securing or locking means (e.g. latches, hooks, anchors, magnets, adhesives, etc) can enable a fixed or moveable or releasable or attachable placement or mounting or attachment or connection of the light pattern projection component to the mobile phone. In such a case, large scale markets are enabled because unmodified phones can be used. Mobile phones are commonly provided with image acquisition capabilities and with accelerometer or motion sensors. In some other embodiments, one or more of the described systems or components and/or corresponding methods or steps or operations can be implemented in specific devices (for example in glucometers provided with image acquisition and accelerometer/motion sensor means). In some other embodiments, a remote control for an insulin pump is used. New generations of insulin pump remote controllers sometimes include a camera suitable for the capture of still images or videos. Such acquisition means can also be used to read bar codes providing data on ingredients associated with food, for example. Such remote controls or glucometers can also be provided with an accelerometer. The addition of light pattern projection components or projectors or pico-projectors or lasers or lighting means, natively inside said devices or as external and connectable additional devices, makes it possible to implement aspects of the invention.

Video embodiments are now discussed.

The described methods can indeed leverage the video capabilities of mobile phones, insulin pump controllers or next-generation glucometers. The speed of image processing for image streams is limited by the refresh frequency of the stream itself, i.e. the processing tasks can only go as fast as images are received. Higher refresh (frame) rates in mobile devices have become available. In most modern smartphones on the market, the frame rate is 25 frames per second (fps), while some models support up to 30 fps. Low-level access to the embedded camera can permit the use of higher frame rates. The range of frame rates itself is limited only by the computational speed and the charge accumulation properties of the photo sensors (charge-coupled device cells, or CCD). The charge-accumulation time is the time necessary for a single pixel to gather enough light and return a measurement. This minimum charge delay limits the speed of image acquisition using silicon structures to roughly 200 fps under standard light conditions (in 2013). In the described embodiments, the speed of image acquisition is in correspondence with the switch rate of the external light pattern source. In mobile phone embodiments, this rate is in relation with the mobile phone flash lamp. In some estimations, according to currently available technologies, the image acquisition time can vary in the range of 100 to 800 milliseconds, corresponding to roughly 1 to 10 fps. In other words, one or more light pattern projections and one or more image acquisitions can occur in a short timeframe, for example one second.

In some embodiments, the image acquisition of the object can be carried out using the video mode of the smartphone or mobile device. A software application triggers the light pattern projector source and at the same time launches the video recording mode. After a predefined time delay, the light source is turned off and the mobile device continues to record for a period of time equal to a certain predefined time. To optimize the computation, at least two video frames are acquired (one with and one without the projected light pattern). In some embodiments, the sequence consisting of turning the projection on and off is repeated and images are continuously acquired. One or more pairs of images are then selected for optimization purposes. Selection criteria comprise quantification of differences in camera positions and orientations, time delays or periods, quality of images or the resulting extraction of the light pattern. In some embodiments, the selection minimizes the time delay and/or maximizes the light pattern intensity between frames. To identify the images with and without the light pattern, the time of the light source switch on/off event can be recorded, and/or the transition between frames with the light pattern and those without can be detected. To recognize the images with and without the light pattern, another optional step can occur, said sub-step consisting of comparing the histograms of subsequent images. In other optional embodiments, parameters such as time delay, frame rate or other hardware specifications can be used to discriminate between the two groups of images (with and without the projected light pattern). After the determination in the video sequence of the pairs of images with and without the projected light pattern, one or more best pairs of images can be determined, for example with one image from each group. The best pair of frames can be chosen using the minimal necessary transformation to remove shake and movement, for example. After the determination of the optimal pair of frames, the other steps of the methods presently described can be carried out.
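
The histogram comparison sub-step can be sketched as follows, assuming 8-bit grayscale frames; the transition threshold (mean plus one standard deviation of the histogram jumps) is an assumption for the sketch, not a value from the specification:

```python
import numpy as np

def split_by_histogram(frames, bins=32):
    """Label video frames as 'pattern off' (0) or 'pattern on' (1) by
    comparing each frame's intensity histogram with its neighbour's:
    a large histogram jump marks a flash/laser transition."""
    hists = [np.histogram(f, bins=bins, range=(0, 255))[0] / f.size
             for f in frames]
    jumps = [np.abs(hists[i] - hists[i + 1]).sum()
             for i in range(len(hists) - 1)]
    threshold = np.mean(jumps) + np.std(jumps)  # crude assumption
    labels, current = [0], 0
    for j in jumps:
        if j > threshold:
            current ^= 1    # toggle group at each detected transition
        labels.append(current)
    return labels           # one 0/1 label per frame
```

Adjacent frames with different labels then form candidate pairs, from which the best pair (minimal time delay, maximal pattern intensity) can be selected.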

In further developments, the computations according to one or more of the described steps or embodiments of the invention can be executed continuously, and optional advice (or mandatory instructions) for improving the acquisition of images is dynamically displayed to the user (for example with arrows displayed on the screen to suggest displacing the mobile phone, centering the object on the screen, changing the viewing angle or the like). Image acquisitions can also be triggered automatically (for example without the user being required to press a button or to touch the screen). For example, such adaptive image acquisition advice or other embodiments of the invention can be repeated until the acquisition of images is considered sufficient based on predefined thresholds associated with criteria comprising one or more of image quality, associated measurements of hand shake, time delays between still images or video frames, the resulting computed light pattern or a combination thereof.
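
Such an adaptive acquisition loop can be sketched as follows; `capture`, `assess` and `advise` are hypothetical placeholders for device-specific routines, and the 0.8 sufficiency threshold is an arbitrary illustration of the predefined thresholds mentioned above:

```python
def acquire_until_sufficient(capture, assess, advise, max_tries=10):
    """Repeat acquisition until `assess` judges the image pair good
    enough; `advise` renders on-screen hints (e.g. arrows) in between.
    All three callables stand in for device-specific code."""
    for _ in range(max_tries):
        pair = capture()             # grab images with/without pattern
        score, hints = assess(pair)  # quality, shake, pattern contrast
        if score >= 0.8:             # hypothetical sufficiency threshold
            return pair
        advise(hints)                # e.g. "move closer", "hold steady"
    raise RuntimeError("Could not acquire a sufficient image pair")
```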

Embodiments of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. Software includes but is not limited to firmware, resident software, microcode, etc. A hardware implementation may prove advantageous for processing performance. Furthermore, some embodiments of the present invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer-readable apparatus can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

What is claimed is:
 1. A system for estimating carbs content of food, said system comprising instructions which when performed by a processor cause the system to operate to: identify a movement of an image acquisition component between a first image of the food and a second image of the food, the second image having been acquired with the food highlighted with a projected light pattern; correct the first image and/or the second image by an operation which compensates movements of the image acquisition component throughout an image acquisition operation based on artifacts to get first and second images of a corrected pair of images; subtract the first and second images of the corrected pair of images; identify the projected light pattern based upon the subtraction of the first and second images of the corrected pair of images; compute the tridimensional shape and the volume of the food given the deformations of the projected light pattern; make segmentation and recognition of the food from the first image; and compute the carbs content of the food based upon the volume, segmentation and recognition of food via use of nutritional databases.

 2. The system of claim 1, further comprising a light pattern projection component to provide the projected light pattern on a surface of the food.

 3. The system of claim 2, wherein the irradiation power of the light pattern projection component is less than 5 mW and operates continuously, or in impulse mode with pulse frequencies from around 0.5 Hz up to 10 Hz.

 4. The system of claim 2, wherein the light pattern projection component comprises one or more light sources chosen from the list comprising a low intensity semiconductor laser diode, a LED, an organic LED (OLED), a pre-existing mobile phone flash such as a white LED or a miniaturized halogen lamp, or a combination thereof, with an electrical power consumption of less than 0.7 W.

 5. The system of claim 2, wherein the light pattern projection component comprises a light source and an optical objective adapted to form and project and/or to focus the projected light pattern onto the food, wherein the power density is less than the predefined value of 55 mW/cm².

 6. The system of claim 2, wherein the relative pose and orientation of the light pattern projection component and the image acquisition component is static during image acquisition or light pattern projection.

 7. The system of claim 1, wherein the power density of the projected light pattern is less than 55 mW/cm².

 8. The system of claim 1, wherein the projected light pattern is composed of geometrical motifs such as sequences of stripes, or dots, or repetitive graphical elements, or of a combination thereof.

 9. The system of claim 1, wherein the projected light pattern is coded by color and/or phase and/or amplitude modulation.

 10. The system of claim 2, wherein the projected light pattern is coded by color and/or phase and/or amplitude modulation, and wherein the coding of the projected light pattern is predefined and is synchronized with the image acquisition component and data processing.

 11. The system of claim 1, wherein the compensation of the movements of the image acquisition component throughout the image acquisition operation is performed by processing data received from a motion sensor.

 12. The system of claim 1, wherein the compensation of the movements of the image acquisition component throughout the image acquisition operation is performed by multi-view geometry methods, by projective warping, by piecewise linear, projective, or higher order warping, by deconvolution, by oriented image sharpening, or by optical flow detection before or after or in an iterative refinement process with the subtraction of the images.

 13. The system of claim 1, wherein the food segmentation and recognition comprise one or more of the operations of segmenting the image, identifying color and/or texture features of segmented parts of the image and performing machine learning based classification for one or more segmented parts of the image, or a combination thereof.

 14. The system of claim 13, further comprising instructions which when performed by a processor cause the system to operate to estimate one or more meal characteristics of the meal captured in the first or second image by multiplying the estimated volumes of the determined food types by unitary volumetric values retrieved from a database, said database being accessed from the Internet, and/or stored locally on the device, and/or determined from food labels by using OCR, and/or associated with geolocation data and/or provided by the user.

 15. The system of claim 13, wherein the one or more characteristics, of the meal or of parts thereof, are one or more of carbs content, fat content, protein content, Glycemic Index (GI), Glycemic Load (GL) and/or Insulin Index (II), or a combination thereof.

 16. The system of claim 14, further comprising instructions which when performed by a processor cause the system to operate to provide an insulin dose recommendation and/or a bolus profile advice based on said one or more meal characteristics.

 17. The system of claim 1, wherein the system is configured to run automatically and project the light pattern via a voice command or a gesture command and/or by a touchscreen command and/or by a geo position and/or by following a predefined time schedule.

 18. The system of claim 2, wherein the image acquisition component and the light pattern projection component are embedded in a mobile device, a glucometer, an insulin pump, a mobile phone or a smartphone.

 19. The system of claim 2, wherein the light pattern projection component is attachable to a mobile device which embeds the image acquisition component via a clip or via insertion into an electrical contact slot, a power charge slot or a USB slot.

 20. The system of claim 1, wherein the first or the second image is based upon a video frame.

 21. The system of claim 1, wherein one or more operations of the system are continuously and automatically repeated until the acquisition of the first and/or the second image is considered as sufficient based on predefined thresholds associated with criteria that comprise one or more of image quality, associated measurements of handshakes, time delays between still images or video frames, the resulting light pattern, or a combination thereof.

 22. The system of claim 1, wherein the image acquisition component is embedded in a handheld device and has a sensitivity between around 0.3 and around 3 lux.

 23. The system of claim 2, wherein the light pattern projection component has an electrical power consumption between 0.05 W and 0.7 W.

 24. A non-transitory computer readable medium encoded with an information processing program for use in an information processing device, said information processing program comprising instructions which when executed by a processor in the information processing device cause the information processing device to perform the operations of claim 1.

 25. A non-transitory computer product comprising the non-transitory computer readable medium according to claim 24.

 26. A method of utilizing a system for estimating carbs content of food, said system comprising a mobile device with a processor and instructions which when executed cause the processor to operate according to claim 1, the method comprising: capturing a first image of the food via the mobile device; capturing a second image of the food via the mobile device, the second image having been acquired with the food highlighted with a projected light pattern from the mobile device; and executing the instructions via the processor of the mobile device.

 27. The method of claim 26, wherein the mobile device is selected from a glucometer, an insulin pump, a mobile phone and a smartphone.

 28. The method of claim 26, wherein a software application on the mobile device triggers a light source which highlights the food with the projected light pattern and at the same time launches a video recording mode which captures the first and second images of the food.

 29. The method of claim 28, wherein after a predefined time delay, the light source is turned off by the software application and the mobile device continues to record video for a period of time equal to a certain predefined time.

 30. The method of claim 29, wherein the software application runs a sequence consisting of turning the light source on and off repeatedly while images are continuously acquired by the mobile device, and wherein one or more of the images are then selected for optimization based upon one or more of differences in camera positions and orientations, time delays or periods, quality of the images, resulting extraction of the light pattern, the images with minimum time delay, and the images with maximum light pattern intensity between frames.